Massively Parallel Analysis of Similarity Matrices on Heterogeneous Hardware
Abstract
We study the performance characteristics of a set of parallel implementations of recurrence quantification analysis (RQA) using OpenCL. RQA is an important tool in climate impact and medical research; a central aspect of it is the construction of a binary matrix that captures the similarities of multi-dimensional vectors. Based on this matrix, quantitative measures are derived. Starting with a baseline implementation, we diversify its properties along four dimensions: the representation of input data, the materialisation of the similarity matrix, the representation of similarity values and the recycling of intermediate results. We evaluate the performance of five implementations by varying the input parameter assignments, the hardware platform employed for execution and whether the default OpenCL compiler optimisations are enabled. We conclude that the performance of RQA depends strongly on the selected implementation as well as on the combination of the variables under investigation. Differences in runtime of up to one order of magnitude are observed, emphasising the importance of performance studies such as the one presented here.
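To make the binary similarity matrix concrete, the following is a minimal sequential sketch in plain Python, not the OpenCL implementations evaluated in the study. It assumes the standard RQA setup, which the abstract does not spell out: the multi-dimensional vectors come from a time-delay embedding of a scalar series, two vectors count as similar when their Euclidean distance is at most a radius, and the recurrence rate serves as one example of a derived measure. The function names `embed`, `recurrence_matrix` and `recurrence_rate` and all parameter values are hypothetical.

```python
import math


def embed(series, dim, delay):
    """Build time-delay vectors (assumed embedding, not stated in the abstract)."""
    count = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(count)]


def recurrence_matrix(vectors, radius):
    """Binary matrix: entry (i, j) is 1 if vectors i and j lie within `radius`."""
    return [
        [1 if math.dist(a, b) <= radius else 0 for b in vectors]
        for a in vectors
    ]


def recurrence_rate(matrix):
    """Fraction of matrix entries equal to 1 (one example of a derived measure)."""
    n = len(matrix)
    return sum(map(sum, matrix)) / (n * n)


# Illustrative input: a short alternating series, chosen only for the example.
series = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
vectors = embed(series, dim=2, delay=1)
matrix = recurrence_matrix(vectors, radius=0.5)
rate = recurrence_rate(matrix)
```

The full matrix is materialised here for clarity and costs quadratic memory in the number of vectors. Whether to materialise it at all is one of the four dimensions the abstract lists, so this layout should be read as one possible choice rather than a recommendation.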
Similar resources
Investigating the Effects of Hardware Parameters on Power Consumptions in SPMV Algorithms on Graphics Processing Units (GPUs)
Although sparse matrix-vector multiplication (SPMV) algorithms are simple, they form important parts of linear algebra algorithms in mathematics and physics. As these algorithms can be run in parallel, graphics processing units (GPUs) have been considered one of the best candidates to run them. In recent years, power consumption has been considered as one of the metr...
Algorithm-based Fault Tolerance for Floating-point Operations in Massively Parallel Systems
This paper considers the applicability of algorithm-based fault tolerance (ABFT) to massively parallel scientific computation. Existing ABFT schemes can provide effective fault tolerance at a low cost for computation on matrices of moderate size; however, the methods do not scale well to floating-point operations on large systems. This paper proposes the use of a partitioned linear encoding scheme ...
Lightweight 4x4 MDS Matrices for Hardware-Oriented Cryptographic Primitives
The linear diffusion layer is an important part of lightweight block ciphers and hash functions. This paper presents an efficient class of lightweight 4x4 MDS matrices whose implementation cost equals that of their corresponding inverses. The main target of the paper is hardware-oriented cryptographic primitives, and the implementation cost is measured in terms of the required number ...
Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms
We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous...
Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.
Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing using the MPI and OpenMP programming languages, based on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...
Publication date: 2015